Extracting features from text to improve statistical machine translation

نویسندگان

چکیده

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Text Segmentation Criteria for Statistical Machine Translation

For several reasons machine translation systems are today unsuited to process long texts in one shot. In particular, in statistical machine translation, heuristic search algorithms are employed whose level of approximation depends on the length of the input. Moreover, processing time can be a bottleneck with long sentences, whereas multiple text chunks can be quickly processed in parallel. Henc...

متن کامل

Optimizing Statistical Machine Translation for Text Simplification

Most recent sentence simplification systems use basic machine translation models to learn lexical and syntactic paraphrases from a manually simplified parallel corpus. These methods are limited by the quality and quantity of manually simplified corpora, which are expensive to build. In this paper, we conduct an indepth adaptation of statistical machine translation to perform text simplification...

متن کامل

Linguistic Input Features Improve Neural Machine Translation

Neural machine translation has recently achieved impressive results, while using little in the way of external linguistic information. In this paper we show that the strong learning capability of neural MT models does not make linguistic features redundant; they can be easily incorporated to provide further improvements in performance. We generalize the embedding layer of the encoder in the att...

متن کامل

Discourse-level features for statistical machine translation

The talk will show how the disambiguation of discourse connectives can improve their automatic translation. Connectives are a class of frequent functional lexical items that play an important role in text readability and coherence. Longer-range context is taken into account to learn the signaled rhetorical relations. The labels obtained from a discourse connective classifier are then integrated...

متن کامل

11,001 New Features for Statistical Machine Translation

We use the Margin Infused Relaxed Algorithm of Crammer et al. to add a large number of new features to two machine translation systems: the Hiero hierarchical phrasebased translation system and our syntax-based translation system. On a large-scale ChineseEnglish translation task, we obtain statistically significant improvements of +1.5 B and +1.1 B, respectively. We analyze the impact of ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Journal of Applied Linguistics and Lexicography

سال: 2019

ISSN: 2687-0215

DOI: 10.33910/2687-0215-2019-1-1-12-17